COURSERA CAPSTONE PROJECT: THE BATTLE OF NEIGHBORHOODS

Singapore Town Location Analysis – A Relocation Guide

A. Introduction

Singapore is one of the preferred destinations to do business in Asia, with its economy ranked as the world’s most competitive economy, based on the latest 2019 World Economic Forum Global Competitiveness Report. Favorable factors contributing to Singapore’s economic performance include an open economy, strong labor-employer relations, diverse cosmopolitan workforce, as well as government stability and responsiveness to change.

Singapore also offers the best quality of life in Asia, based on Mercer’s 2019 Quality of Living Survey, which considers factors like political stability, healthcare, education, crime, recreation and transport. People have been attracted to this cosmopolitan island state, due to its vibrant economy, low personal income taxes, cultural diversity and high quality of living. Presently, the immigrant population in Singapore number 2.16 million, and makes up ~40% of the total population of ~5.7 million people.

However, cost of living is a concern, with Singapore being rated as the world’s most expensive city by the Economist Intelligence Unit’s 2018 Worldwide Cost of Living Report. Furthermore, Singapore has a large population size for its size, with ~8,000 people per km2. This makes Singapore 230 times denser than the United States, and more than 2,500 times denser than Australia.

As such, the goal of this analysis is to identify the most livable neighborhoods in Singapore for individuals looking to relocate to Singapore and those considering moving within Singapore. For the purpose of this exercise, we will define the most livable neighborhoods as having: (i) an affordable median rental price, (ii) a tolerable population density, (iii) a balanced mix of amenities in the neighborhood, and lastly (iv) a wide selection of good food options nearby.

B. Data Description

This analysis will require the use of the following data sources:

Singapore Median Rent by Town and Flat Type
Data on Singapore towns and corresponding median rental prices by town and flat type will be retrieved from Data.gov.sg (https://data.gov.sg), the government’s one-stop access portal to publicly available datasets. Since Median Rent by Town and Flat Type data covers information from April 1, 2005 to December 31, 2019 on a quarterly basis, we will be using 2019-Q4 data for this analysis, as this is the most recent dataset. To simplify the analysis, the average rental price for each town will be determined by the median rental price for 4-room flat types in that town, as it is available as a benchmark across almost all towns.

Singapore Population Density by Town
Data on Singapore’s population density by town will be obtained by scraping data from the Wikipedia page on ‘Planning Areas of Singapore’ (https://en.wikipedia.org/wiki/Planning_Areas_of_Singapore), which contains data on town name, region, area (km2), population and density (/km2). Population density (people per km2) is a measure of the degree of 'crowding' of the town and is calculated by dividing the town’s population by total area of town.

Singapore Town Location Data
Singapore’s geospatial data will be retrieved from Data.gov.sg (https://data.gov.sg). Master Plan 2019 Planning Area Boundary (No Sea) data provides indicative polygons of planning area boundary, and this GeoJSON data on Singapore’s planning areas will enable visualization on maps. In parallel, geographic coordinates of town centers will be retrieved using Google Maps, with coordinates of MRT stations being used as the center for all towns for the purpose of this analysis.

Singapore Venue Information from Foursquare API
Foursquare API (https://foursquare.com/) will be used to explore the neighborhoods of each town. Using Foursquare API, we will understand the various venues in each neighborhood, to assess if there are a balanced mix of amenities and to determine the most common venue categories. In addition, we will also be using Foursquare API to retrieve venue ratings for each location. However, as venue ratings are a premium endpoint, we are limited to only 50 premium calls per day with a Personal account on Foursquare API. In view of this constraint, we will limit the analysis of ratings to only food venues, since this is where ratings will likely matter more.

C. Methodology

Import Required Libraries

In [1]:
import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import numpy as np # library to handle data in a vectorized manner

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
import seaborn as sns
%matplotlib inline
mpl.style.use('ggplot')

# import StandardScaler for normalizing data
from sklearn.preprocessing import StandardScaler

# import k-means fpr clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')
Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Libraries imported.

1. Singapore Median Rent by Town and Flat Type

Extract data from zip file

In [2]:
import zipfile

!wget -q -O 'median-rent-by-town-and-flat-type.zip' "https://data.gov.sg/dataset/b35046dc-7428-4cff-968d-ef4c3e9e6c99/download"
zf = zipfile.ZipFile('./median-rent-by-town-and-flat-type.zip')
sg_median_rent_raw = pd.read_csv(zf.open("median-rent-by-town-and-flat-type.csv"))
sg_median_rent_raw.head()
Out[2]:
quarter town flat_type median_rent
0 2005-Q2 ANG MO KIO 1-RM na
1 2005-Q2 ANG MO KIO 2-RM na
2 2005-Q2 ANG MO KIO 3-RM 800
3 2005-Q2 ANG MO KIO 4-RM 950
4 2005-Q2 ANG MO KIO 5-RM -

Data clean-up

  • Drop/ignore cells with missing data
  • Use only 2019-Q4 data
  • Consider only 4-room flat type
  • Fix data types
In [3]:
# Drop rows with rental price = 'na'.
sg_median_rent=sg_median_rent_raw[~sg_median_rent_raw['median_rent'].isin(['-','na'])]
sg_median_rent.head()
Out[3]:
quarter town flat_type median_rent
2 2005-Q2 ANG MO KIO 3-RM 800
3 2005-Q2 ANG MO KIO 4-RM 950
8 2005-Q2 BEDOK 3-RM 800
9 2005-Q2 BEDOK 4-RM 900
10 2005-Q2 BEDOK 5-RM 1025
In [4]:
# Retain only 2019-Q4 data, as it is most current
sg_median_rent=sg_median_rent[sg_median_rent['quarter'] == "2019-Q4"]
sg_median_rent.head()
Out[4]:
quarter town flat_type median_rent
9095 2019-Q4 ANG MO KIO 3-RM 1700
9096 2019-Q4 ANG MO KIO 4-RM 2100
9097 2019-Q4 ANG MO KIO 5-RM 2300
9101 2019-Q4 BEDOK 3-RM 1700
9102 2019-Q4 BEDOK 4-RM 2000
In [5]:
# Consider only 4-RM flat type, as it is available across almost all towns
sg_median_rent=sg_median_rent[sg_median_rent['flat_type'] == "4-RM"]

# Reset index, because multiple rows dropped
sg_median_rent.reset_index(drop=True, inplace=True)

sg_median_rent
Out[5]:
quarter town flat_type median_rent
0 2019-Q4 ANG MO KIO 4-RM 2100
1 2019-Q4 BEDOK 4-RM 2000
2 2019-Q4 BISHAN 4-RM 2250
3 2019-Q4 BUKIT BATOK 4-RM 1900
4 2019-Q4 BUKIT MERAH 4-RM 2500
5 2019-Q4 BUKIT PANJANG 4-RM 1700
6 2019-Q4 CENTRAL 4-RM 2550
7 2019-Q4 CHOA CHU KANG 4-RM 1800
8 2019-Q4 CLEMENTI 4-RM 2400
9 2019-Q4 GEYLANG 4-RM 2250
10 2019-Q4 HOUGANG 4-RM 1900
11 2019-Q4 JURONG EAST 4-RM 2000
12 2019-Q4 JURONG WEST 4-RM 1950
13 2019-Q4 KALLANG/WHAMPOA 4-RM 2400
14 2019-Q4 MARINE PARADE 4-RM 2100
15 2019-Q4 PASIR RIS 4-RM 2000
16 2019-Q4 PUNGGOL 4-RM 1900
17 2019-Q4 QUEENSTOWN 4-RM 2600
18 2019-Q4 SEMBAWANG 4-RM 1800
19 2019-Q4 SENGKANG 4-RM 1900
20 2019-Q4 SERANGOON 4-RM 2100
21 2019-Q4 TAMPINES 4-RM 2050
22 2019-Q4 TOA PAYOH 4-RM 2300
23 2019-Q4 WOODLANDS 4-RM 1700
24 2019-Q4 YISHUN 4-RM 1800
In [6]:
# Replace CENTRAL to OUTRAM, KALLANG/WHAMPOA to KALLANG in 'town'
sg_median_rent['town']= sg_median_rent['town'].replace('CENTRAL', 'OUTRAM') 
sg_median_rent['town']= sg_median_rent['town'].replace('KALLANG/WHAMPOA', 'KALLANG')
sg_median_rent
Out[6]:
quarter town flat_type median_rent
0 2019-Q4 ANG MO KIO 4-RM 2100
1 2019-Q4 BEDOK 4-RM 2000
2 2019-Q4 BISHAN 4-RM 2250
3 2019-Q4 BUKIT BATOK 4-RM 1900
4 2019-Q4 BUKIT MERAH 4-RM 2500
5 2019-Q4 BUKIT PANJANG 4-RM 1700
6 2019-Q4 OUTRAM 4-RM 2550
7 2019-Q4 CHOA CHU KANG 4-RM 1800
8 2019-Q4 CLEMENTI 4-RM 2400
9 2019-Q4 GEYLANG 4-RM 2250
10 2019-Q4 HOUGANG 4-RM 1900
11 2019-Q4 JURONG EAST 4-RM 2000
12 2019-Q4 JURONG WEST 4-RM 1950
13 2019-Q4 KALLANG 4-RM 2400
14 2019-Q4 MARINE PARADE 4-RM 2100
15 2019-Q4 PASIR RIS 4-RM 2000
16 2019-Q4 PUNGGOL 4-RM 1900
17 2019-Q4 QUEENSTOWN 4-RM 2600
18 2019-Q4 SEMBAWANG 4-RM 1800
19 2019-Q4 SENGKANG 4-RM 1900
20 2019-Q4 SERANGOON 4-RM 2100
21 2019-Q4 TAMPINES 4-RM 2050
22 2019-Q4 TOA PAYOH 4-RM 2300
23 2019-Q4 WOODLANDS 4-RM 1700
24 2019-Q4 YISHUN 4-RM 1800
In [7]:
# Check data types
sg_median_rent.dtypes
Out[7]:
quarter        object
town           object
flat_type      object
median_rent    object
dtype: object
In [8]:
# Convert median rent to float64
sg_median_rent['median_rent']=sg_median_rent['median_rent'].astype(np.float64)
sg_median_rent.head()
Out[8]:
quarter town flat_type median_rent
0 2019-Q4 ANG MO KIO 4-RM 2100.0
1 2019-Q4 BEDOK 4-RM 2000.0
2 2019-Q4 BISHAN 4-RM 2250.0
3 2019-Q4 BUKIT BATOK 4-RM 1900.0
4 2019-Q4 BUKIT MERAH 4-RM 2500.0
In [9]:
# drop columns 'quarter', 'flat_type' from 'sg_population_density' dataframe
sg_median_rent.drop(['quarter', 'flat_type'], axis = 1, inplace=True)
sg_median_rent.head()
Out[9]:
town median_rent
0 ANG MO KIO 2100.0
1 BEDOK 2000.0
2 BISHAN 2250.0
3 BUKIT BATOK 1900.0
4 BUKIT MERAH 2500.0
In [10]:
# rename columns 'town' to 'Town', 'median_rent' to 'Median Rent (SGD/month)'
sg_median_rent.rename(columns={'town': 'Town', 'median_rent': 'Median Rent (SGD/month)'}, inplace=True)
sg_median_rent.head()
Out[10]:
Town Median Rent (SGD/month)
0 ANG MO KIO 2100.0
1 BEDOK 2000.0
2 BISHAN 2250.0
3 BUKIT BATOK 1900.0
4 BUKIT MERAH 2500.0
In [11]:
# sort by Median Rent
sg_median_rent_graph = sg_median_rent.sort_values('Median Rent (SGD/month)', ascending=True)
sg_median_rent_graph.head()
Out[11]:
Town Median Rent (SGD/month)
5 BUKIT PANJANG 1700.0
23 WOODLANDS 1700.0
24 YISHUN 1800.0
18 SEMBAWANG 1800.0
7 CHOA CHU KANG 1800.0
In [12]:
sg_median_rent_graph.shape
Out[12]:
(25, 2)
In [13]:
# plot horizontal bar chart for Median Rent by Town
sg_median_rent_graph.plot(kind='barh', figsize=(12,8), color='steelblue')
plt.xlabel('Median Rent (SGD/month)')
plt.yticks(range(25), sg_median_rent_graph['Town'])
plt.title('Median Rent for 4-Room Flat by Town in Singapore for Q4 2019')

plt.show()
In [14]:
# plot histogram for Median Rent
sg_median_rent_graph.plot(kind='hist', figsize=(12,8))
plt.xlabel('Median Rent (SGD/month)')
plt.ylabel('Number of Towns in Singapore')
plt.title('Histogram of Median Rent for 4-Room Flat in Singapore for Q4 2019')

plt.show()

2. Singapore Population Density by Town

Web scraping

In [15]:
# scrape table info from Wikipedia page and put into pandas dataframe
sg_population_density_raw = pd.read_html('https://en.wikipedia.org/wiki/Planning_Areas_of_Singapore', header=0)[2]
sg_population_density_raw.head()
Out[15]:
Name (English) Malay Chinese Pinyin Tamil Region Area (km2) Population[7] Density (/km2)
0 Ang Mo Kio NaN 宏茂桥 Hóng mào qiáo ஆங் மோ கியோ North-East 13.94 163950 13400
1 Bedok * 勿洛 Wù luò பிடோக் East 21.69 279380 13000
2 Bishan NaN 碧山 Bì shān பீஷான் Central 7.62 88010 12000
3 Boon Lay NaN 文礼 Wén lǐ பூன் லே West 8.23 30 3.6
4 Bukit Batok * 武吉巴督 Wǔjí bā dū புக்கிட் பாத்தோக் West 11.13 153740 14000
In [16]:
# Check data types
sg_population_density_raw.dtypes
Out[16]:
Name (English)     object
Malay              object
Chinese            object
Pinyin             object
Tamil              object
Region             object
Area (km2)        float64
Population[7]      object
Density (/km2)     object
dtype: object

Data clean-up & wrangling

  • Drop unnecessary columns from data
  • Rename columns
  • Calculate population density
In [17]:
# drop columns 'Malay', 'Chinese', 'Pinyin', 'Tamil' from 'sg_population_density_raw' dataframe
sg_population_density_raw.drop(['Malay', 'Chinese', 'Pinyin', 'Tamil'], axis = 1, inplace=True)
sg_population_density_raw.head()
Out[17]:
Name (English) Region Area (km2) Population[7] Density (/km2)
0 Ang Mo Kio North-East 13.94 163950 13400
1 Bedok East 21.69 279380 13000
2 Bishan Central 7.62 88010 12000
3 Boon Lay West 8.23 30 3.6
4 Bukit Batok West 11.13 153740 14000
In [18]:
# rename columns 'Name (English)' to 'Town', 'Population[7]' to 'Population'
sg_population_density_raw.rename(columns={'Name (English)': 'Town', 'Population[7]': 'Population'}, inplace=True)
sg_population_density_raw.head()
Out[18]:
Town Region Area (km2) Population Density (/km2)
0 Ang Mo Kio North-East 13.94 163950 13400
1 Bedok East 21.69 279380 13000
2 Bishan Central 7.62 88010 12000
3 Boon Lay West 8.23 30 3.6
4 Bukit Batok West 11.13 153740 14000
In [19]:
# create new dataframe 'sg_population_density'
sg_population_density = sg_population_density_raw

# drop columns 'Region', 'Area (km2)', 'Population' from 'sg_population_density' dataframe
sg_population_density.drop(['Region', 'Area (km2)', 'Population'], axis = 1, inplace=True)

# rename column 'Density (/km2)' to 'Population Density (people/km2)'
sg_population_density.rename(columns={'Density (/km2)': 'Population Density (people/km2)'}, inplace=True)

sg_population_density.head()
Out[19]:
Town Population Density (people/km2)
0 Ang Mo Kio 13400
1 Bedok 13000
2 Bishan 12000
3 Boon Lay 3.6
4 Bukit Batok 14000
In [20]:
# capitalize values for 'Town'
sg_population_density['Town'] = sg_population_density['Town'].str.upper()
sg_population_density
Out[20]:
Town Population Density (people/km2)
0 ANG MO KIO 13400
1 BEDOK 13000
2 BISHAN 12000
3 BOON LAY 3.6
4 BUKIT BATOK 14000
5 BUKIT MERAH 11000
6 BUKIT PANJANG 15000
7 BUKIT TIMAH 4400
8 CENTRAL WATER CATCHMENT *
9 CHANGI 80.62
10 CHANGI BAY *
11 CHOA CHU KANG 30000
12 CLEMENTI 9800
13 DOWNTOWN CORE 680
14 GEYLANG 11400
15 HOUGANG 16000
16 JURONG EAST 4400
17 JURONG WEST 18000
18 KALLANG 11000
19 LIM CHU KANG 5.2
20 MANDAI 180.2
21 MARINA EAST *
22 MARINA SOUTH *
23 MARINE PARADE 8000
24 MUSEUM 480
25 NEWTON 3800
26 NORTH-EASTERN ISLANDS 1.2
27 NOVENA 5600
28 ORCHARD 960.3
29 OUTRAM 13500
30 PASIR RIS 9600
31 PAYA LEBAR 3.4
32 PIONEER 8.3
33 PUNGGOL 17800
34 QUEENSTOWN 4400
35 RIVER VALLEY 6800
36 ROCHOR 8300
37 SELETAR 26.3
38 SEMBAWANG 8400
39 SENGKANG 23000
40 SERANGOON 11500
41 SIMPANG *
42 SINGAPORE RIVER 3000
43 SOUTHERN ISLANDS 244
44 STRAITS VIEW *
45 SUNGEI KADUT 53.2
46 TAMPINES 12400
47 TANGLIN 2800
48 TENGAH 1.4
49 TOA PAYOH 14300
50 TUAS 2.3
51 WESTERN ISLANDS 0.25
52 WESTERN WATER CATCHMENT 13
53 WOODLANDS 18700
54 YISHUN 10100
In [21]:
# replace "*" to 0
sg_population_density.replace('*', 0, inplace = True)
sg_population_density
Out[21]:
Town Population Density (people/km2)
0 ANG MO KIO 13400
1 BEDOK 13000
2 BISHAN 12000
3 BOON LAY 3.6
4 BUKIT BATOK 14000
5 BUKIT MERAH 11000
6 BUKIT PANJANG 15000
7 BUKIT TIMAH 4400
8 CENTRAL WATER CATCHMENT 0
9 CHANGI 80.62
10 CHANGI BAY 0
11 CHOA CHU KANG 30000
12 CLEMENTI 9800
13 DOWNTOWN CORE 680
14 GEYLANG 11400
15 HOUGANG 16000
16 JURONG EAST 4400
17 JURONG WEST 18000
18 KALLANG 11000
19 LIM CHU KANG 5.2
20 MANDAI 180.2
21 MARINA EAST 0
22 MARINA SOUTH 0
23 MARINE PARADE 8000
24 MUSEUM 480
25 NEWTON 3800
26 NORTH-EASTERN ISLANDS 1.2
27 NOVENA 5600
28 ORCHARD 960.3
29 OUTRAM 13500
30 PASIR RIS 9600
31 PAYA LEBAR 3.4
32 PIONEER 8.3
33 PUNGGOL 17800
34 QUEENSTOWN 4400
35 RIVER VALLEY 6800
36 ROCHOR 8300
37 SELETAR 26.3
38 SEMBAWANG 8400
39 SENGKANG 23000
40 SERANGOON 11500
41 SIMPANG 0
42 SINGAPORE RIVER 3000
43 SOUTHERN ISLANDS 244
44 STRAITS VIEW 0
45 SUNGEI KADUT 53.2
46 TAMPINES 12400
47 TANGLIN 2800
48 TENGAH 1.4
49 TOA PAYOH 14300
50 TUAS 2.3
51 WESTERN ISLANDS 0.25
52 WESTERN WATER CATCHMENT 13
53 WOODLANDS 18700
54 YISHUN 10100
In [22]:
# Convert 'Population Density (people/km2)' to float64
sg_population_density['Population Density (people/km2)']=sg_population_density['Population Density (people/km2)'].astype(np.float64)
sg_population_density.head()
Out[22]:
Town Population Density (people/km2)
0 ANG MO KIO 13400.0
1 BEDOK 13000.0
2 BISHAN 12000.0
3 BOON LAY 3.6
4 BUKIT BATOK 14000.0
In [23]:
# sort by Population Density (people/km2)
sg_population_density_graph = sg_population_density.sort_values('Population Density (people/km2)', ascending=True)
sg_population_density_graph
Out[23]:
Town Population Density (people/km2)
44 STRAITS VIEW 0.00
41 SIMPANG 0.00
22 MARINA SOUTH 0.00
8 CENTRAL WATER CATCHMENT 0.00
10 CHANGI BAY 0.00
21 MARINA EAST 0.00
51 WESTERN ISLANDS 0.25
26 NORTH-EASTERN ISLANDS 1.20
48 TENGAH 1.40
50 TUAS 2.30
31 PAYA LEBAR 3.40
3 BOON LAY 3.60
19 LIM CHU KANG 5.20
32 PIONEER 8.30
52 WESTERN WATER CATCHMENT 13.00
37 SELETAR 26.30
45 SUNGEI KADUT 53.20
9 CHANGI 80.62
20 MANDAI 180.20
43 SOUTHERN ISLANDS 244.00
24 MUSEUM 480.00
13 DOWNTOWN CORE 680.00
28 ORCHARD 960.30
47 TANGLIN 2800.00
42 SINGAPORE RIVER 3000.00
25 NEWTON 3800.00
16 JURONG EAST 4400.00
7 BUKIT TIMAH 4400.00
34 QUEENSTOWN 4400.00
27 NOVENA 5600.00
35 RIVER VALLEY 6800.00
23 MARINE PARADE 8000.00
36 ROCHOR 8300.00
38 SEMBAWANG 8400.00
30 PASIR RIS 9600.00
12 CLEMENTI 9800.00
54 YISHUN 10100.00
18 KALLANG 11000.00
5 BUKIT MERAH 11000.00
14 GEYLANG 11400.00
40 SERANGOON 11500.00
2 BISHAN 12000.00
46 TAMPINES 12400.00
1 BEDOK 13000.00
0 ANG MO KIO 13400.00
29 OUTRAM 13500.00
4 BUKIT BATOK 14000.00
49 TOA PAYOH 14300.00
6 BUKIT PANJANG 15000.00
15 HOUGANG 16000.00
33 PUNGGOL 17800.00
17 JURONG WEST 18000.00
53 WOODLANDS 18700.00
39 SENGKANG 23000.00
11 CHOA CHU KANG 30000.00
In [24]:
sg_population_density_graph.shape
Out[24]:
(55, 2)
In [25]:
# plot horizontal bar chart for Population Density by Town
sg_population_density_graph.plot(kind='barh', figsize=(12,12), color='steelblue')
plt.xlabel('Population Density (people/km2)')
plt.yticks(range(55), sg_population_density_graph['Town'])
plt.title('Population Density by Town / Planning Area in Singapore')

plt.show()
In [26]:
# plot histogram for Population Density
sg_population_density_graph.plot(kind='hist', figsize=(12,8))
plt.xlabel('Population Density (people/km2)')
plt.ylabel('Number of Towns in Singapore')
plt.title('Histogram of Population Density for Towns / Planning Areas in Singapore')

plt.show()

3. Correlation Analysis for Median Rent and Population Density

In [27]:
# merge sg_median_rent and sg_population_density datasets on 'Town' column value
sg_corr_data = pd.merge(sg_median_rent, sg_population_density, on='Town')
sg_corr_data.head()
Out[27]:
Town Median Rent (SGD/month) Population Density (people/km2)
0 ANG MO KIO 2100.0 13400.0
1 BEDOK 2000.0 13000.0
2 BISHAN 2250.0 12000.0
3 BUKIT BATOK 1900.0 14000.0
4 BUKIT MERAH 2500.0 11000.0
In [28]:
# plot scatter plot of Population Density and Median Rent
sns.regplot(x='Population Density (people/km2)', y='Median Rent (SGD/month)', data=sg_corr_data)
Out[28]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fcd5b8555c0>
In [29]:
# examine correlation between Population Density and Median Rent
sg_corr_data[['Population Density (people/km2)', 'Median Rent (SGD/month)']].corr()
Out[29]:
Population Density (people/km2) Median Rent (SGD/month)
Population Density (people/km2) 1.000000 -0.450388
Median Rent (SGD/month) -0.450388 1.000000

Population Density provides some predictive value for Median Rent, with correlation at approximately -0.450

4. Singapore Town Location Data

Part 1: Town Center Location

Geographic coordinates of town centers retrieved using Google Maps, with coordinates of MRT stations being used as the center for all towns for the purpose of this analysis. Data from this exercise is saved in separate CSV file.

In [30]:
# Retrieve csv file for Town Center coordinates
sg_town_center = pd.read_csv('https://raw.githubusercontent.com/RaphaelO-SG/Coursera_Capstone/master/The%20Battle%20of%20Neighborhoods/MRT-station-coordinates.csv')
sg_town_center.head()
Out[30]:
town lat lng
0 ANG MO KIO 1.370017 103.849450
1 BEDOK 1.324039 103.930036
2 BISHAN 1.351236 103.848456
3 BUKIT BATOK 1.349073 103.749664
4 BUKIT MERAH 1.281906 103.823919
In [31]:
# rename columns 'town' to 'Town', 'lat' to 'Town Latitude', 'lng' to 'Town Longitude'
sg_town_center.rename(columns={'town': 'Town', 'lat': 'Town Latitude', 'lng': 'Town Longitude'}, inplace=True)
sg_town_center
Out[31]:
Town Town Latitude Town Longitude
0 ANG MO KIO 1.370017 103.849450
1 BEDOK 1.324039 103.930036
2 BISHAN 1.351236 103.848456
3 BUKIT BATOK 1.349073 103.749664
4 BUKIT MERAH 1.281906 103.823919
5 BUKIT PANJANG 1.378436 103.761767
6 BUKIT TIMAH 1.330714 103.797633
7 OUTRAM 1.280225 103.839486
8 CHOA CHU KANG 1.385092 103.744322
9 CLEMENTI 1.315303 103.765244
10 GEYLANG 1.321382 103.871414
11 HOUGANG 1.371292 103.892161
12 JURONG EAST 1.333415 103.742119
13 JURONG WEST 1.338883 103.706208
14 KALLANG 1.311469 103.871400
15 MARINE PARADE 1.302720 103.905248
16 PASIR RIS 1.372411 103.949369
17 PUNGGOL 1.405264 103.902097
18 QUEENSTOWN 1.294442 103.806114
19 SEMBAWANG 1.449025 103.820153
20 SENGKANG 1.391653 103.895133
21 SERANGOON 1.349944 103.873092
22 TAMPINES 1.352528 103.945322
23 TOA PAYOH 1.332703 103.847808
24 WOODLANDS 1.437094 103.786483
25 YISHUN 1.429464 103.835239
Part 2: Create merged dataframe containing Town, Town Center Coordinates, Median Rent, Population Density by Town
In [32]:
# merge sg_town_center and sg_median_rent datasets on 'Town' column value
sg_town_data = pd.merge(sg_town_center, sg_median_rent, on='Town')
sg_town_data.head()
Out[32]:
Town Town Latitude Town Longitude Median Rent (SGD/month)
0 ANG MO KIO 1.370017 103.849450 2100.0
1 BEDOK 1.324039 103.930036 2000.0
2 BISHAN 1.351236 103.848456 2250.0
3 BUKIT BATOK 1.349073 103.749664 1900.0
4 BUKIT MERAH 1.281906 103.823919 2500.0
In [33]:
# merge sg_town_data and sg_population_density datasets on 'Town' column value
sg_town_data = pd.merge(sg_town_data, sg_population_density, on='Town')
sg_town_data
Out[33]:
Town Town Latitude Town Longitude Median Rent (SGD/month) Population Density (people/km2)
0 ANG MO KIO 1.370017 103.849450 2100.0 13400.0
1 BEDOK 1.324039 103.930036 2000.0 13000.0
2 BISHAN 1.351236 103.848456 2250.0 12000.0
3 BUKIT BATOK 1.349073 103.749664 1900.0 14000.0
4 BUKIT MERAH 1.281906 103.823919 2500.0 11000.0
5 BUKIT PANJANG 1.378436 103.761767 1700.0 15000.0
6 OUTRAM 1.280225 103.839486 2550.0 13500.0
7 CHOA CHU KANG 1.385092 103.744322 1800.0 30000.0
8 CLEMENTI 1.315303 103.765244 2400.0 9800.0
9 GEYLANG 1.321382 103.871414 2250.0 11400.0
10 HOUGANG 1.371292 103.892161 1900.0 16000.0
11 JURONG EAST 1.333415 103.742119 2000.0 4400.0
12 JURONG WEST 1.338883 103.706208 1950.0 18000.0
13 KALLANG 1.311469 103.871400 2400.0 11000.0
14 MARINE PARADE 1.302720 103.905248 2100.0 8000.0
15 PASIR RIS 1.372411 103.949369 2000.0 9600.0
16 PUNGGOL 1.405264 103.902097 1900.0 17800.0
17 QUEENSTOWN 1.294442 103.806114 2600.0 4400.0
18 SEMBAWANG 1.449025 103.820153 1800.0 8400.0
19 SENGKANG 1.391653 103.895133 1900.0 23000.0
20 SERANGOON 1.349944 103.873092 2100.0 11500.0
21 TAMPINES 1.352528 103.945322 2050.0 12400.0
22 TOA PAYOH 1.332703 103.847808 2300.0 14300.0
23 WOODLANDS 1.437094 103.786483 1700.0 18700.0
24 YISHUN 1.429464 103.835239 1800.0 10100.0
Part 3: Generate map of Singapore with towns superimposed on top
In [34]:
# get geographical coordinates for Singapore with geopy library
address = 'Singapore'

geolocator = Nominatim(user_agent="SG explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Singapore are {}, {}.'.format(latitude, longitude))
The geograpical coordinate of Singapore are 1.357107, 103.8194992.

Generate map of Singapore with Towns and Median Rent (SGD / month) superimposed on top.

In [35]:
# create map of Singapore using latitude and longitude values
map_singapore = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, town, median_rent in zip(sg_town_data['Town Latitude'], sg_town_data['Town Longitude'], sg_town_data['Town'],  sg_town_data['Median Rent (SGD/month)']):
    label = '{}, SGD {} / month'.format(town, median_rent)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_singapore)  
    
map_singapore
Out[35]:
Part 4: Generate choropleth map of Singapore based on median rent, with towns superimposed on top
In [36]:
# Retrieve csv file containing Planning Area names in GeoJSON file by Town
sg_planning_area = pd.read_csv('https://raw.githubusercontent.com/RaphaelO-SG/Coursera_Capstone/master/The%20Battle%20of%20Neighborhoods/singapore-2019-planning-area.csv')
sg_planning_area.head()
Out[36]:
name town
0 kml_52 ANG MO KIO
1 kml_1 BEDOK
2 kml_51 BISHAN
3 kml_2 BOON LAY
4 kml_3 BUKIT BATOK
In [37]:
# rename column 'name' to 'Name', and 'town' to 'Town'
sg_planning_area.rename(columns={'name': 'Name', 'town': 'Town'}, inplace=True)
sg_planning_area.head()
Out[37]:
Name Town
0 kml_52 ANG MO KIO
1 kml_1 BEDOK
2 kml_51 BISHAN
3 kml_2 BOON LAY
4 kml_3 BUKIT BATOK
In [38]:
# merge datasets on Town column value
sg_choropleth = pd.merge(sg_planning_area, sg_population_density, on='Town')
sg_choropleth.head()
Out[38]:
Name Town Population Density (people/km2)
0 kml_52 ANG MO KIO 13400.0
1 kml_1 BEDOK 13000.0
2 kml_51 BISHAN 12000.0
3 kml_2 BOON LAY 3.6
4 kml_3 BUKIT BATOK 14000.0
In [39]:
# download Singapore geojson file
!wget --quiet https://github.com/RaphaelO-SG/Coursera_Capstone/raw/master/The%20Battle%20of%20Neighborhoods/master-plan-2019-planning-area-boundary-no-sea-geojson.geojson
    
print('GeoJSON file downloaded!')
GeoJSON file downloaded!
In [40]:
sg_geo = r'master-plan-2019-planning-area-boundary-no-sea-geojson.geojson'

# create map of Singapore using latitude and longitude values
map_singapore_choropleth = folium.Map(location=[latitude, longitude], zoom_start=12)

# create a numpy array of length 6 and has linear spacing from the minimum to maximum population density
threshold_scale = np.linspace(sg_choropleth['Population Density (people/km2)'].min(),
                            sg_choropleth['Population Density (people/km2)'].max(),
                            6, dtype=int)
threshold_scale = threshold_scale.tolist() # change the numpy array to a list
threshold_scale[-1] = threshold_scale[-1] + 0.01 # make sure that the last value of the list is greater than the maximum population density

# add choropleth layer on map
map_singapore_choropleth.choropleth(
    geo_data=sg_geo,
    data=sg_choropleth,
    columns=['Name', 'Population Density (people/km2)'],
    key_on='feature.properties.Name',
    threshold_scale=threshold_scale,
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Sigapore Population Density (people/km2) by Planning Area'
)

# add markers to map
for lat, lng, town, median_rent in zip(sg_town_data['Town Latitude'], sg_town_data['Town Longitude'], sg_town_data['Town'],  sg_town_data['Median Rent (SGD/month)']):
    label = '{}, SGD {} / month'.format(town, median_rent)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_singapore_choropleth)  

# display map
map_singapore_choropleth
Out[40]:

5. Singapore Venue Information from Foursquare API

Part 1: Define Foursquare Credentials and Version
In [41]:
# @hidden_cell
CLIENT_ID = 'ADQ4BCC5UBXF1NXEMCBZPLUDSG5NYWQARLFQTG4CMTYQ35LU' # your Foursquare ID
CLIENT_SECRET = 'HWC4WHHMM3XE4UBBLCWRNLB43WAMNYSP4E0O2LJGAADOUFCW' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: loaded')
print('CLIENT_SECRET: loaded')
Your credentials:
CLIENT_ID: loaded
CLIENT_SECRET: loaded
Part 2: Explore All Towns

Define funtion to get latitudes and longitudes for each town, and the venues within a radius of 500 meters.

In [42]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Town', 
                  'Town Latitude', 
                  'Town Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

Run the above function on each town and create a new dataframe sg_venues

In [43]:
# call getNearbyVenues for each Neighborhood
sg_venues = getNearbyVenues(names=sg_town_data['Town'],
                                   latitudes=sg_town_data['Town Latitude'],
                                   longitudes=sg_town_data['Town Longitude']
                                  )
ANG MO KIO
BEDOK
BISHAN
BUKIT BATOK
BUKIT MERAH
BUKIT PANJANG
OUTRAM
CHOA CHU KANG
CLEMENTI
GEYLANG
HOUGANG
JURONG EAST
JURONG WEST
KALLANG
MARINE PARADE
PASIR RIS
PUNGGOL
QUEENSTOWN
SEMBAWANG
SENGKANG
SERANGOON
TAMPINES
TOA PAYOH
WOODLANDS
YISHUN

Check size of resulting dataframe

In [44]:
print (sg_venues.shape)
sg_venues.head()
(1005, 7)
Out[44]:
Town Town Latitude Town Longitude Venue Venue Latitude Venue Longitude Venue Category
0 ANG MO KIO 1.370017 103.84945 FairPrice Xtra 1.369279 103.848886 Supermarket
1 ANG MO KIO 1.370017 103.84945 Old Chang Kee 1.369094 103.848389 Snack Place
2 ANG MO KIO 1.370017 103.84945 Subway 1.369136 103.847612 Sandwich Place
3 ANG MO KIO 1.370017 103.84945 MOS Burger 1.369170 103.847831 Burger Joint
4 ANG MO KIO 1.370017 103.84945 NTUC FairPrice 1.371507 103.847082 Supermarket

Check number of venues returned for each town

In [45]:
sg_venues.groupby('Town').count()
Out[45]:
Town Latitude Town Longitude Venue Venue Latitude Venue Longitude Venue Category
Town
ANG MO KIO 43 43 43 43 43 43
BEDOK 59 59 59 59 59 59
BISHAN 47 47 47 47 47 47
BUKIT BATOK 23 23 23 23 23 23
BUKIT MERAH 10 10 10 10 10 10
BUKIT PANJANG 40 40 40 40 40 40
CHOA CHU KANG 20 20 20 20 20 20
CLEMENTI 62 62 62 62 62 62
GEYLANG 8 8 8 8 8 8
HOUGANG 43 43 43 43 43 43
JURONG EAST 77 77 77 77 77 77
JURONG WEST 71 71 71 71 71 71
KALLANG 17 17 17 17 17 17
MARINE PARADE 57 57 57 57 57 57
OUTRAM 60 60 60 60 60 60
PASIR RIS 38 38 38 38 38 38
PUNGGOL 45 45 45 45 45 45
QUEENSTOWN 16 16 16 16 16 16
SEMBAWANG 17 17 17 17 17 17
SENGKANG 30 30 30 30 30 30
SERANGOON 38 38 38 38 38 38
TAMPINES 57 57 57 57 57 57
TOA PAYOH 27 27 27 27 27 27
WOODLANDS 53 53 53 53 53 53
YISHUN 47 47 47 47 47 47

Number of unique categories curated from all the returned venues

In [46]:
print('There are {} uniques categories.'.format(len(sg_venues['Venue Category'].unique())))
There are 163 uniques categories.
Part 3: Analyze Neighoborhoods of Each Town
In [47]:
# one hot encoding
sg_onehot = pd.get_dummies(sg_venues[['Venue Category']], prefix="", prefix_sep="")

# add Town column back to dataframe
sg_onehot['Town'] = sg_venues['Town'] 

# move Town column to the first column
sg_onehot.drop(labels=['Town'], axis=1,inplace = True)
sg_onehot.insert(loc=0, column='Town', value=sg_venues['Town'].to_list())

sg_onehot.head()
Out[47]:
Town Accessories Store American Restaurant Arcade Art Gallery Arts & Crafts Store Asian Restaurant Athletics & Sports Australian Restaurant BBQ Joint Bakery Bank Basketball Court Betting Shop Bike Rental / Bike Share Bistro Bookstore Bowling Alley Breakfast Spot Bubble Tea Shop Buffet Burger Joint Burrito Place Bus Line Bus Station Bus Stop Café Candy Store Cantonese Restaurant Cha Chaan Teng Chinese Breakfast Place Chinese Restaurant Clothing Store Cocktail Bar Coffee Shop Concert Hall Convenience Store Cosmetics Shop Dance Studio Department Store Dessert Shop Dim Sum Restaurant Diner Discount Store Donut Shop Dumpling Restaurant Eastern European Restaurant Electronics Store Farmers Market Fast Food Restaurant Filipino Restaurant Fish & Chips Shop Fishing Spot Food & Drink Shop Food Court Food Service Food Stand French Restaurant Fried Chicken Joint Frozen Yogurt Shop Furniture / Home Store Gaming Cafe Garden Gastropub Gay Bar German Restaurant Gift Shop Golf Course Grocery Store Gym Gym / Fitness Center Hainan Restaurant Halal Restaurant Historic Site Hobby Shop Hong Kong Restaurant Hookah Bar Hostel Hotel Hotel Bar Hotpot Restaurant Ice Cream Shop Indian Restaurant Indonesian Restaurant Italian Restaurant Japanese Curry Restaurant Japanese Restaurant Juice Bar Karaoke Bar Kids Store Kitchen Supply Store Korean Restaurant Light Rail Station Lounge Malay Restaurant Massage Studio Mediterranean Restaurant Metro Station Mexican Restaurant Miscellaneous Shop Mobile Phone Shop Modern European Restaurant Movie Theater Multiplex Nail Salon Neighborhood Night Market Noodle House Optical Shop Outlet Store Park Pet Store Pharmacy Pizza Place Playground Plaza Pool Pool Hall Portuguese Restaurant Post Office Pub Ramen Restaurant Recreation Center Residential Building (Apartment / Condo) Restaurant Rock Club Salad Place Salon / Barbershop Sandwich Place Scenic Lookout Sculpture Garden Seafood Restaurant Shaanxi Restaurant Shoe Store Shopping Mall Skate Park Skating Rink Snack Place Soup Place Spa Spanish Restaurant Speakeasy Sporting Goods Shop Sports Bar Sports Club Stadium Steakhouse Supermarket Sushi Restaurant Tapas Restaurant Taxi Taxi Stand Tea Room Thai Restaurant Thrift / Vintage Store Toy / Game Store Trail Train Station Vegetarian / Vegan Restaurant Video Game Store Vietnamese Restaurant Wine Bar Wings Joint Yoga Studio
0 ANG MO KIO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 ANG MO KIO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 ANG MO KIO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 ANG MO KIO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 ANG MO KIO 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In [48]:
sg_onehot.shape
Out[48]:
(1005, 164)

Group rows by Town and by taking total occurrences of each category

In [49]:
sg_grouped_total = sg_onehot.groupby('Town').sum().reset_index()
sg_grouped_total
Out[49]:
Town Accessories Store American Restaurant Arcade Art Gallery Arts & Crafts Store Asian Restaurant Athletics & Sports Australian Restaurant BBQ Joint Bakery Bank Basketball Court Betting Shop Bike Rental / Bike Share Bistro Bookstore Bowling Alley Breakfast Spot Bubble Tea Shop Buffet Burger Joint Burrito Place Bus Line Bus Station Bus Stop Café Candy Store Cantonese Restaurant Cha Chaan Teng Chinese Breakfast Place Chinese Restaurant Clothing Store Cocktail Bar Coffee Shop Concert Hall Convenience Store Cosmetics Shop Dance Studio Department Store Dessert Shop Dim Sum Restaurant Diner Discount Store Donut Shop Dumpling Restaurant Eastern European Restaurant Electronics Store Farmers Market Fast Food Restaurant Filipino Restaurant Fish & Chips Shop Fishing Spot Food & Drink Shop Food Court Food Service Food Stand French Restaurant Fried Chicken Joint Frozen Yogurt Shop Furniture / Home Store Gaming Cafe Garden Gastropub Gay Bar German Restaurant Gift Shop Golf Course Grocery Store Gym Gym / Fitness Center Hainan Restaurant Halal Restaurant Historic Site Hobby Shop Hong Kong Restaurant Hookah Bar Hostel Hotel Hotel Bar Hotpot Restaurant Ice Cream Shop Indian Restaurant Indonesian Restaurant Italian Restaurant Japanese Curry Restaurant Japanese Restaurant Juice Bar Karaoke Bar Kids Store Kitchen Supply Store Korean Restaurant Light Rail Station Lounge Malay Restaurant Massage Studio Mediterranean Restaurant Metro Station Mexican Restaurant Miscellaneous Shop Mobile Phone Shop Modern European Restaurant Movie Theater Multiplex Nail Salon Neighborhood Night Market Noodle House Optical Shop Outlet Store Park Pet Store Pharmacy Pizza Place Playground Plaza Pool Pool Hall Portuguese Restaurant Post Office Pub Ramen Restaurant Recreation Center Residential Building (Apartment / Condo) Restaurant Rock Club Salad Place Salon / Barbershop Sandwich Place Scenic Lookout Sculpture Garden Seafood Restaurant Shaanxi Restaurant Shoe Store Shopping Mall Skate Park Skating Rink Snack Place Soup Place Spa Spanish Restaurant Speakeasy Sporting Goods Shop Sports Bar Sports Club Stadium Steakhouse Supermarket Sushi Restaurant Tapas Restaurant Taxi Taxi Stand Tea Room Thai Restaurant Thrift / Vintage Store Toy / Game Store Trail Train Station Vegetarian / Vegan Restaurant Video Game Store Vietnamese Restaurant Wine Bar Wings Joint Yoga Studio
0 ANG MO KIO 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 2 0 1 0 0 0 0 0 0 0 0 0 1 0 0 4 0 1 0 0 0 2 0 0 0 0 0 0 0 0 1 0 0 0 0 4 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 2 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 BEDOK 0 1 0 0 0 3 0 0 0 2 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 2 0 0 0 0 4 1 0 5 0 0 0 0 0 1 0 0 0 0 0 0 0 0 2 0 0 0 0 3 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 1 1 0 3 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 3 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 0 0 0 2 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0
2 BISHAN 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 1 0 0 3 0 0 0 0 1 0 3 0 0 0 0 3 0 0 4 0 0 2 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 0 5 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
3 BUKIT BATOK 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 2 0 0 3 0 1 1 0 1 0 0 0 0 0 0 0 0 0 2 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 BUKIT MERAH 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 BUKIT PANJANG 0 1 0 0 0 2 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 3 0 0 0 1 1 0 0 3 0 1 0 0 0 0 0 1 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 2 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 2 3 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
6 CHOA CHU KANG 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
7 CLEMENTI 0 1 0 0 1 4 0 0 2 1 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 2 3 0 0 6 0 0 0 0 0 3 2 0 0 0 0 0 2 0 2 0 0 0 0 5 0 0 0 2 0 0 0 0 0 0 0 0 0 1 2 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 2 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 2 0 0 0 0 0 1 0 0 0 0
8 GEYLANG 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9 HOUGANG 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 2 0 0 0 0 3 0 0 4 0 0 0 0 1 2 0 1 0 0 0 0 0 0 4 0 1 0 0 3 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 0 0 2 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
10 JURONG EAST 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 2 1 1 0 0 1 0 4 1 0 0 0 5 2 0 4 0 0 0 0 2 1 0 0 0 0 1 0 0 0 1 0 0 0 0 4 0 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 6 0 1 0 0 2 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 2 0 0 1 0 2 3 0 1 0 0 1 0 0 1 0 0 0 1 2 2 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
11 JURONG WEST 0 1 0 0 0 7 0 0 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 3 0 0 1 0 5 0 0 3 1 1 0 0 0 5 0 0 0 1 0 0 1 0 5 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 2 0 0 1 6 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 1 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
12 KALLANG 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
13 MARINE PARADE 0 0 0 0 0 0 1 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 2 1 0 2 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4 1 0 1 2 1 0 0 4 1 0 0 0 0 0 1 0 3 0 0 0 0 0 0 0 3 0 1 0 1 1 0 0 0 0 1 0 0 0 2 0 0 1 0 1 0 1 0 2 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 1 2 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
14 OUTRAM 0 2 0 1 0 2 0 2 0 3 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 6 0 1 0 0 2 0 1 4 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2 0 0 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 5 0 4 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 2 0 0
15 PASIR RIS 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 1 1 2 0 0 1 0 0 0 3 0 0 1 1 3 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 2 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
16 PUNGGOL 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 2 0 0 0 0 2 1 0 1 0 1 0 0 0 0 0 0 1 0 0 0 1 0 2 0 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 1 0 0 0 1 1 0 0 0 0 0 0 1 2 1 0 1 1 0 0 0 1 0 0 2 0 1 0 0 0
17 QUEENSTOWN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
18 SEMBAWANG 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 1 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
19 SENGKANG 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 3 0 1 2 0 0 1 0 0 0 0 0 0 0 0 3 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
20 SERANGOON 0 0 0 0 0 2 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 2 3 0 4 0 0 1 0 1 1 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 2 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
21 TAMPINES 0 1 0 0 0 1 0 0 0 5 0 0 0 0 0 1 0 0 2 0 0 0 0 1 1 4 0 0 0 0 2 2 0 3 0 0 0 0 0 2 0 0 0 0 1 0 1 0 2 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 2 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 2 1 0 1 1 1 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0
22 TOA PAYOH 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 3 0 0 3 0 0 1 0 0 2 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
23 WOODLANDS 0 1 1 0 0 5 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 3 0 0 0 0 2 2 0 3 0 1 0 0 1 1 0 0 0 0 0 0 2 0 2 0 0 0 0 2 0 0 0 0 2 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 2 0 0 0 3 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
24 YISHUN 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0 0 0 0 3 0 0 2 0 0 0 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 3 0 0 0 2 0 1 0 0 0 0 0 0 0 1 1 0 2 1 0 1 0 0 0 0 0 0 0 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 2 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0
In [50]:
sg_grouped_total.shape
Out[50]:
(25, 164)
In [51]:
# Understand which are the most common Venue Categories across all Towns
sg_venues_count = sg_venues.groupby('Venue Category').count()
sg_venues_count = sg_venues_count.sort_values('Venue', ascending=False).reset_index()
sg_venues_count.head()
Out[51]:
Venue Category Town Town Latitude Town Longitude Venue Venue Latitude Venue Longitude
0 Coffee Shop 72 72 72 72 72 72
1 Chinese Restaurant 55 55 55 55 55 55
2 Food Court 50 50 50 50 50 50
3 Japanese Restaurant 42 42 42 42 42 42
4 Café 42 42 42 42 42 42
In [52]:
# drop columns 'Town', 'Town Latitude', 'Town Longitude', 'Venue Latitude', 'Venue Longitude' from 'sg_venues_count' dataframe
sg_venues_count.drop(['Town', 'Town Latitude', 'Town Longitude', 'Venue Latitude', 'Venue Longitude'], axis = 1, inplace=True)

# rename column 'Venue' to 'Venue Count'
sg_venues_count.rename(columns={'Venue': 'Venue Count'}, inplace=True)

sg_venues_count
Out[52]:
Venue Category Venue Count
0 Coffee Shop 72
1 Chinese Restaurant 55
2 Food Court 50
3 Japanese Restaurant 42
4 Café 42
5 Fast Food Restaurant 40
6 Asian Restaurant 35
7 Supermarket 30
8 Shopping Mall 27
9 Dessert Shop 24
10 Bakery 22
11 Sandwich Place 19
12 Italian Restaurant 17
13 Sushi Restaurant 17
14 Bubble Tea Shop 16
15 Noodle House 14
16 Indian Restaurant 14
17 Multiplex 13
18 Gym 13
19 Fried Chicken Joint 12
20 Bus Station 12
21 Seafood Restaurant 12
22 Snack Place 12
23 Clothing Store 12
24 Bookstore 11
25 Restaurant 10
26 Ice Cream Shop 10
27 Park 10
28 Thai Restaurant 9
29 Department Store 8
30 Cosmetics Shop 8
31 Convenience Store 8
32 Electronics Store 8
33 Soup Place 8
34 Pool 8
35 Frozen Yogurt Shop 8
36 American Restaurant 8
37 BBQ Joint 7
38 Pharmacy 6
39 Dumpling Restaurant 6
40 Hotel 6
41 Spa 6
42 Burger Joint 6
43 Steakhouse 6
44 Vegetarian / Vegan Restaurant 5
45 Korean Restaurant 5
46 Karaoke Bar 5
47 Malay Restaurant 5
48 Diner 4
49 Hotpot Restaurant 4
50 Bistro 4
51 Grocery Store 4
52 Indonesian Restaurant 4
53 Pet Store 4
54 Pizza Place 4
55 Shoe Store 4
56 Juice Bar 4
57 Breakfast Spot 4
58 Dim Sum Restaurant 4
59 Chinese Breakfast Place 3
60 Massage Studio 3
61 Kids Store 3
62 Halal Restaurant 3
63 Gym / Fitness Center 3
64 Salad Place 3
65 Australian Restaurant 3
66 Wings Joint 3
67 Trail 3
68 Recreation Center 3
69 Ramen Restaurant 3
70 Portuguese Restaurant 3
71 Arcade 3
72 Playground 3
73 Furniture / Home Store 3
74 Bus Stop 3
75 Mexican Restaurant 2
76 Video Game Store 2
77 Miscellaneous Shop 2
78 Mobile Phone Shop 2
79 Modern European Restaurant 2
80 Movie Theater 2
81 Wine Bar 2
82 Optical Shop 2
83 Plaza 2
84 Pool Hall 2
85 Toy / Game Store 2
86 Sports Bar 2
87 Thrift / Vintage Store 2
88 Tapas Restaurant 2
89 Light Rail Station 2
90 Yoga Studio 2
91 Betting Shop 2
92 Hobby Shop 2
93 Garden 2
94 Gastropub 2
95 Bowling Alley 2
96 Candy Store 2
97 Hainan Restaurant 2
98 French Restaurant 2
99 Hostel 2
100 Athletics & Sports 2
101 Concert Hall 1
102 Skating Rink 1
103 Cocktail Bar 1
104 Cha Chaan Teng 1
105 Skate Park 1
106 Spanish Restaurant 1
107 Speakeasy 1
108 Sporting Goods Shop 1
109 Cantonese Restaurant 1
110 Sports Club 1
111 Dance Studio 1
112 Discount Store 1
113 Stadium 1
114 Kitchen Supply Store 1
115 Burrito Place 1
116 Bus Line 1
117 Donut Shop 1
118 Buffet 1
119 Taxi 1
120 Taxi Stand 1
121 Tea Room 1
122 Bike Rental / Bike Share 1
123 Basketball Court 1
124 Train Station 1
125 Bank 1
126 Vietnamese Restaurant 1
127 Arts & Crafts Store 1
128 Art Gallery 1
129 Shaanxi Restaurant 1
130 Eastern European Restaurant 1
131 Sculpture Garden 1
132 Scenic Lookout 1
133 Japanese Curry Restaurant 1
134 Hotel Bar 1
135 Mediterranean Restaurant 1
136 Metro Station 1
137 Hookah Bar 1
138 Hong Kong Restaurant 1
139 Historic Site 1
140 Nail Salon 1
141 Neighborhood 1
142 Night Market 1
143 Golf Course 1
144 Gift Shop 1
145 Outlet Store 1
146 German Restaurant 1
147 Gay Bar 1
148 Gaming Cafe 1
149 Food Stand 1
150 Food Service 1
151 Food & Drink Shop 1
152 Post Office 1
153 Pub 1
154 Fishing Spot 1
155 Fish & Chips Shop 1
156 Residential Building (Apartment / Condo) 1
157 Filipino Restaurant 1
158 Rock Club 1
159 Farmers Market 1
160 Salon / Barbershop 1
161 Lounge 1
162 Accessories Store 1
In [53]:
# get data for Top 40 Venue Categories
sg_venues_top40 = sg_venues_count.head(40)
sg_venues_top40
Out[53]:
Venue Category Venue Count
0 Coffee Shop 72
1 Chinese Restaurant 55
2 Food Court 50
3 Japanese Restaurant 42
4 Café 42
5 Fast Food Restaurant 40
6 Asian Restaurant 35
7 Supermarket 30
8 Shopping Mall 27
9 Dessert Shop 24
10 Bakery 22
11 Sandwich Place 19
12 Italian Restaurant 17
13 Sushi Restaurant 17
14 Bubble Tea Shop 16
15 Noodle House 14
16 Indian Restaurant 14
17 Multiplex 13
18 Gym 13
19 Fried Chicken Joint 12
20 Bus Station 12
21 Seafood Restaurant 12
22 Snack Place 12
23 Clothing Store 12
24 Bookstore 11
25 Restaurant 10
26 Ice Cream Shop 10
27 Park 10
28 Thai Restaurant 9
29 Department Store 8
30 Cosmetics Shop 8
31 Convenience Store 8
32 Electronics Store 8
33 Soup Place 8
34 Pool 8
35 Frozen Yogurt Shop 8
36 American Restaurant 8
37 BBQ Joint 7
38 Pharmacy 6
39 Dumpling Restaurant 6
In [54]:
# sort by Venue Count
sg_venues_top40_graph = sg_venues_top40.sort_values('Venue Count', ascending=True)
sg_venues_top40_graph.head()
Out[54]:
Venue Category Venue Count
39 Dumpling Restaurant 6
38 Pharmacy 6
37 BBQ Joint 7
36 American Restaurant 8
30 Cosmetics Shop 8
In [55]:
# plot horizontal bar chart for Top 40 Venue Categories
sg_venues_top40_graph.plot(kind='barh', figsize=(12,8), color='steelblue')
plt.xlabel('Venue Count')
plt.yticks(range(40), sg_venues_top40_graph['Venue Category'])
plt.title('Top 40 Venue Categories')

plt.show()
In [56]:
# select 10 key amenities to assess if there are a balanced mix of amenities in each town
sg_amenities = sg_grouped_total[['Coffee Shop', 'Food Court', 'Fast Food Restaurant', 'Café', 'Shopping Mall', 'Supermarket', 'Clothing Store', 'Bookstore', 'Convenience Store', 'Gym']]
sg_amenities.head()
Out[56]:
Coffee Shop Food Court Fast Food Restaurant Café Shopping Mall Supermarket Clothing Store Bookstore Convenience Store Gym
0 4 4 1 0 1 2 0 0 1 1
1 5 3 2 2 1 2 1 1 0 0
2 4 5 0 3 1 2 0 1 0 1
3 3 2 2 1 1 0 0 0 1 0
4 1 0 0 0 0 0 0 0 1 0
In [57]:
# For 10 key amenities selected, use 0 to indicate absence, and 1 to indicate presence in each town
sg_amenities = sg_amenities.clip(upper=1)
sg_amenities.head()
Out[57]:
Coffee Shop Food Court Fast Food Restaurant Café Shopping Mall Supermarket Clothing Store Bookstore Convenience Store Gym
0 1 1 1 0 1 1 0 0 1 1
1 1 1 1 1 1 1 1 1 0 0
2 1 1 0 1 1 1 0 1 0 1
3 1 1 1 1 1 0 0 0 1 0
4 1 0 0 0 0 0 0 0 1 0
In [58]:
# Determine how many of the 10 key amenities does each town have
sg_amenities['Total Amenities'] = sg_amenities.sum(axis=1)
sg_amenities.head()
Out[58]:
Coffee Shop Food Court Fast Food Restaurant Café Shopping Mall Supermarket Clothing Store Bookstore Convenience Store Gym Total Amenities
0 1 1 1 0 1 1 0 0 1 1 7
1 1 1 1 1 1 1 1 1 0 0 8
2 1 1 0 1 1 1 0 1 0 1 7
3 1 1 1 1 1 0 0 0 1 0 6
4 1 0 0 0 0 0 0 0 1 0 2
In [59]:
# add Town column back to sg_amenities dataframe
sg_amenities['Town'] = sg_grouped_total['Town'] 

# move Town column to the first column
sg_amenities.drop(labels=['Town'], axis=1,inplace = True)
sg_amenities.insert(loc=0, column='Town', value=sg_grouped_total['Town'].to_list())
sg_amenities
Out[59]:
Town Coffee Shop Food Court Fast Food Restaurant Café Shopping Mall Supermarket Clothing Store Bookstore Convenience Store Gym Total Amenities
0 ANG MO KIO 1 1 1 0 1 1 0 0 1 1 7
1 BEDOK 1 1 1 1 1 1 1 1 0 0 8
2 BISHAN 1 1 0 1 1 1 0 1 0 1 7
3 BUKIT BATOK 1 1 1 1 1 0 0 0 1 0 6
4 BUKIT MERAH 1 0 0 0 0 0 0 0 1 0 2
5 BUKIT PANJANG 1 0 1 1 1 1 0 0 1 0 6
6 CHOA CHU KANG 1 1 1 1 0 1 0 0 0 1 6
7 CLEMENTI 1 1 1 0 1 1 0 1 0 1 7
8 GEYLANG 1 1 0 1 0 1 0 0 0 0 4
9 HOUGANG 1 1 1 1 1 1 0 1 0 1 8
10 JURONG EAST 1 1 1 1 1 1 1 1 0 0 8
11 JURONG WEST 1 1 1 1 1 1 0 1 1 0 8
12 KALLANG 0 1 0 0 0 1 0 0 0 0 2
13 MARINE PARADE 1 0 0 1 1 1 1 0 0 0 5
14 OUTRAM 1 1 0 1 0 0 0 1 0 0 4
15 PASIR RIS 1 1 1 0 1 1 0 0 0 1 6
16 PUNGGOL 1 1 1 1 1 1 1 0 1 0 8
17 QUEENSTOWN 0 1 0 1 0 0 0 0 0 0 2
18 SEMBAWANG 1 0 1 0 1 1 0 0 0 0 4
19 SENGKANG 1 1 1 1 1 1 0 0 1 1 8
20 SERANGOON 1 0 1 1 1 1 1 0 0 0 6
21 TAMPINES 1 1 1 1 1 1 1 1 0 1 9
22 TOA PAYOH 1 1 1 1 0 1 0 1 0 0 6
23 WOODLANDS 1 1 1 1 1 1 1 1 1 1 10
24 YISHUN 1 1 1 1 1 1 0 1 0 1 8
In [60]:
# create dataframe with only Total Amenities
sg_amenities_total = sg_amenities[['Town', 'Total Amenities']]
sg_amenities_total.head()
Out[60]:
Town Total Amenities
0 ANG MO KIO 7
1 BEDOK 8
2 BISHAN 7
3 BUKIT BATOK 6
4 BUKIT MERAH 2
In [61]:
# sort by Total Amenities
sg_amenities_graph = sg_amenities_total.sort_values('Total Amenities', ascending=True)
sg_amenities_graph
Out[61]:
Town Total Amenities
12 KALLANG 2
4 BUKIT MERAH 2
17 QUEENSTOWN 2
18 SEMBAWANG 4
8 GEYLANG 4
14 OUTRAM 4
13 MARINE PARADE 5
22 TOA PAYOH 6
20 SERANGOON 6
3 BUKIT BATOK 6
5 BUKIT PANJANG 6
6 CHOA CHU KANG 6
15 PASIR RIS 6
0 ANG MO KIO 7
7 CLEMENTI 7
2 BISHAN 7
11 JURONG WEST 8
10 JURONG EAST 8
16 PUNGGOL 8
9 HOUGANG 8
19 SENGKANG 8
1 BEDOK 8
24 YISHUN 8
21 TAMPINES 9
23 WOODLANDS 10
In [62]:
sg_amenities_graph.shape
Out[62]:
(25, 2)
In [63]:
# plot horizontal bar chart for Total Amenities by Town
sg_amenities_graph.plot(kind='barh', figsize=(12,8), color='steelblue')
plt.xlabel('Count of Key Venue Categories')
plt.yticks(range(25), sg_amenities_graph['Town'])
plt.title('Towns in Singapore with Key Venue Categories Required for Balanced Mix of Amenities')

plt.show()
In [64]:
# plot histogram for Population Density
sg_amenities_graph.plot(kind='hist', figsize=(12,8))
plt.xlabel('Count of Key Venue Categories')
plt.ylabel('Number of Towns in Singapore')
plt.title('Histogram of Towns in Singapore with Key Venue Categories Required for Balanced Mix of Amenities')

plt.show()

6. Correlation Analysis for Median Rent, Population Density and Total Amenities

In [65]:
# merge sg_town_data and sg_amenities_total datasets on 'Town' column value
sg_town_amenities_data = pd.merge(sg_town_data, sg_amenities_total, on='Town')
sg_town_amenities_data
Out[65]:
Town Town Latitude Town Longitude Median Rent (SGD/month) Population Density (people/km2) Total Amenities
0 ANG MO KIO 1.370017 103.849450 2100.0 13400.0 7
1 BEDOK 1.324039 103.930036 2000.0 13000.0 8
2 BISHAN 1.351236 103.848456 2250.0 12000.0 7
3 BUKIT BATOK 1.349073 103.749664 1900.0 14000.0 6
4 BUKIT MERAH 1.281906 103.823919 2500.0 11000.0 2
5 BUKIT PANJANG 1.378436 103.761767 1700.0 15000.0 6
6 OUTRAM 1.280225 103.839486 2550.0 13500.0 4
7 CHOA CHU KANG 1.385092 103.744322 1800.0 30000.0 6
8 CLEMENTI 1.315303 103.765244 2400.0 9800.0 7
9 GEYLANG 1.321382 103.871414 2250.0 11400.0 4
10 HOUGANG 1.371292 103.892161 1900.0 16000.0 8
11 JURONG EAST 1.333415 103.742119 2000.0 4400.0 8
12 JURONG WEST 1.338883 103.706208 1950.0 18000.0 8
13 KALLANG 1.311469 103.871400 2400.0 11000.0 2
14 MARINE PARADE 1.302720 103.905248 2100.0 8000.0 5
15 PASIR RIS 1.372411 103.949369 2000.0 9600.0 6
16 PUNGGOL 1.405264 103.902097 1900.0 17800.0 8
17 QUEENSTOWN 1.294442 103.806114 2600.0 4400.0 2
18 SEMBAWANG 1.449025 103.820153 1800.0 8400.0 4
19 SENGKANG 1.391653 103.895133 1900.0 23000.0 8
20 SERANGOON 1.349944 103.873092 2100.0 11500.0 6
21 TAMPINES 1.352528 103.945322 2050.0 12400.0 9
22 TOA PAYOH 1.332703 103.847808 2300.0 14300.0 6
23 WOODLANDS 1.437094 103.786483 1700.0 18700.0 10
24 YISHUN 1.429464 103.835239 1800.0 10100.0 8
In [66]:
# plot scatter plot of Total Amenities and Median Rent
sns.regplot(x='Total Amenities', y='Median Rent (SGD/month)', data=sg_town_amenities_data)
Out[66]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fcd5ea65b00>
In [67]:
# plot scatter plot of Total Amenities and Population Density
sns.regplot(x='Total Amenities', y='Population Density (people/km2)', data=sg_town_amenities_data)
Out[67]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fcd583652e8>
In [68]:
# examine correlation between Total Amenities and Median Rent
sg_town_amenities_data[['Median Rent (SGD/month)','Population Density (people/km2)','Total Amenities']].corr()
Out[68]:
Median Rent (SGD/month) Population Density (people/km2) Total Amenities
Median Rent (SGD/month) 1.000000 -0.450388 -0.653017
Population Density (people/km2) -0.450388 1.000000 0.364211
Total Amenities -0.653017 0.364211 1.000000

Population Density provides some predictive value for Median Rent, with correlation at approximately -0.450.
Total Amenities provides fairly good predictive value for Median Rent, with correlation at approximately -0.653.
Population Density is however not a good predictor for Total Amenities, with correlation at approximately 0.364.

7. Cluster Towns

Pre-processing

Remove Town Latitude and Town Longitude, as these are not applicable in the clustering analysis.

In [69]:
df = sg_town_amenities_data[['Town', 'Median Rent (SGD/month)', 'Population Density (people/km2)', 'Total Amenities']]
df.head()
Out[69]:
Town Median Rent (SGD/month) Population Density (people/km2) Total Amenities
0 ANG MO KIO 2100.0 13400.0 7
1 BEDOK 2000.0 13000.0 8
2 BISHAN 2250.0 12000.0 7
3 BUKIT BATOK 1900.0 14000.0 6
4 BUKIT MERAH 2500.0 11000.0 2
Normalizing over the standard deviation

Normalization is a statistical method that helps mathematical-based algorithms to interpret features with different magnitudes and distributions equally. StandardScaler() will be used to normalize the dataset.

In [70]:
X = df.values[:,1:]
Clus_dataSet = StandardScaler().fit_transform(X)
Clus_dataSet
/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/utils/validation.py:595: DataConversionWarning: Data with input dtype object was converted to float64 by StandardScaler.
  warnings.warn(msg, DataConversionWarning)
/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/utils/validation.py:595: DataConversionWarning: Data with input dtype object was converted to float64 by StandardScaler.
  warnings.warn(msg, DataConversionWarning)
Out[70]:
array([[ 0.08511036,  0.03198838,  0.37139068],
       [-0.30175492, -0.0424032 ,  0.83562902],
       [ 0.6654083 , -0.22838216,  0.37139068],
       [-0.68862021,  0.14357576, -0.09284767],
       [ 1.63257151, -0.41436112, -1.94980105],
       [-1.46235079,  0.32955472, -0.09284767],
       [ 1.82600416,  0.05058628, -1.02132436],
       [-1.0754855 ,  3.11923913, -0.09284767],
       [ 1.24570623, -0.63753588,  0.37139068],
       [ 0.6654083 , -0.33996954, -1.02132436],
       [-0.68862021,  0.51553368,  0.83562902],
       [-0.30175492, -1.64182227,  0.83562902],
       [-0.49518757,  0.8874916 ,  0.83562902],
       [ 1.24570623, -0.41436112, -1.94980105],
       [ 0.08511036, -0.97229801, -0.55708601],
       [-0.30175492, -0.67473167, -0.09284767],
       [-0.68862021,  0.85029581,  0.83562902],
       [ 2.0194368 , -1.64182227, -1.94980105],
       [-1.0754855 , -0.89790642, -1.02132436],
       [-0.68862021,  1.8173864 ,  0.83562902],
       [ 0.08511036, -0.32137164, -0.09284767],
       [-0.10832228, -0.15399058,  1.29986737],
       [ 0.85884094,  0.19936945, -0.09284767],
       [-1.46235079,  1.01767687,  1.76410571],
       [-1.0754855 , -0.58174219,  0.83562902]])
Determing optimal k for k-means clustering

For each k value, we will initialise k-means and use the inertia attribute to identify the sum of squared distances of samples to the nearest cluster centre

In [71]:
Sum_of_squared_distances = []
K = range(1,15)
for k in K:
    km = KMeans(n_clusters=k)
    km = km.fit(Clus_dataSet)
    Sum_of_squared_distances.append(km.inertia_)

Plot sum of squared distances for k in the range specified above. If the plot looks like an arm, then the elbow on the arm is optimal k.

In [72]:
plt.plot(K, Sum_of_squared_distances, 'bx-')
plt.xlabel('k')
plt.ylabel('Sum_of_squared_distances')
plt.title('Elbow Method For Optimal k')
plt.show()

Based on Elbow Method, optimal k = 3

Modeling

Run k-means to cluster the towns into 3 clusters

In [73]:
clusterNum = 3
k_means = KMeans(init = "k-means++", n_clusters = clusterNum, n_init = 12)
k_means.fit(X)
labels = k_means.labels_
print(labels)
[2 2 2 2 1 2 2 0 1 1 2 1 2 1 1 1 2 1 1 0 1 2 2 2 1]
Insights

Assign labels to each row in dataframe

In [74]:
sg_town_amenities_data['Clus_km'] = labels
sg_town_amenities_data['Clus_km'] = sg_town_amenities_data['Clus_km'] + 1
sg_town_amenities_data.head()
Out[74]:
Town Town Latitude Town Longitude Median Rent (SGD/month) Population Density (people/km2) Total Amenities Clus_km
0 ANG MO KIO 1.370017 103.849450 2100.0 13400.0 7 3
1 BEDOK 1.324039 103.930036 2000.0 13000.0 8 3
2 BISHAN 1.351236 103.848456 2250.0 12000.0 7 3
3 BUKIT BATOK 1.349073 103.749664 1900.0 14000.0 6 3
4 BUKIT MERAH 1.281906 103.823919 2500.0 11000.0 2 2
In [75]:
df['Clus_km'] = labels
df['Clus_km'] = df['Clus_km'] + 1
df.head()
/opt/conda/envs/Python36/lib/python3.6/site-packages/ipykernel/__main__.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':
/opt/conda/envs/Python36/lib/python3.6/site-packages/ipykernel/__main__.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  from ipykernel import kernelapp as app
Out[75]:
Town Median Rent (SGD/month) Population Density (people/km2) Total Amenities Clus_km
0 ANG MO KIO 2100.0 13400.0 7 3
1 BEDOK 2000.0 13000.0 8 3
2 BISHAN 2250.0 12000.0 7 3
3 BUKIT BATOK 1900.0 14000.0 6 3
4 BUKIT MERAH 2500.0 11000.0 2 2

Check centroid values by averaging features in each cluster

In [76]:
df.groupby('Clus_km').mean()
Out[76]:
Median Rent (SGD/month) Population Density (people/km2) Total Amenities
Clus_km
1 1850.000000 26500.000000 7.000000
2 2177.272727 9054.545455 4.909091
3 2025.000000 14841.666667 7.250000

Examine distribution of towns based on clusters

In [77]:
df.groupby('Clus_km').describe().transpose()
Out[77]:
Clus_km 1 2 3
Median Rent (SGD/month) count 2.000000 11.000000 12.000000
mean 1850.000000 2177.272727 2025.000000
std 70.710678 273.279743 247.257687
min 1800.000000 1800.000000 1700.000000
25% 1825.000000 2000.000000 1900.000000
50% 1850.000000 2100.000000 1975.000000
75% 1875.000000 2400.000000 2137.500000
max 1900.000000 2600.000000 2550.000000
Population Density (people/km2) count 2.000000 11.000000 12.000000
mean 26500.000000 9054.545455 14841.666667
std 4949.747468 2566.462015 2282.127529
min 23000.000000 4400.000000 12000.000000
25% 24750.000000 8200.000000 13300.000000
50% 26500.000000 9800.000000 14150.000000
75% 28250.000000 11000.000000 16450.000000
max 30000.000000 11500.000000 18700.000000
Total Amenities count 2.000000 11.000000 12.000000
mean 7.000000 4.909091 7.250000
std 1.414214 2.300198 1.602555
min 6.000000 2.000000 4.000000
25% 6.500000 3.000000 6.000000
50% 7.000000 5.000000 7.500000
75% 7.500000 6.500000 8.000000
max 8.000000 8.000000 10.000000
Visualize Clusters
In [78]:
sg_geo = r'master-plan-2019-planning-area-boundary-no-sea-geojson.geojson'

# create map of Singapore using latitude and longitude values
map_singapore_choropleth = folium.Map(location=[latitude, longitude], zoom_start=12)

# create a numpy array of length 6 and has linear spacing from the minimum to maximum population density
threshold_scale = np.linspace(sg_choropleth['Population Density (people/km2)'].min(),
                            sg_choropleth['Population Density (people/km2)'].max(),
                            6, dtype=int)
threshold_scale = threshold_scale.tolist() # change the numpy array to a list
threshold_scale[-1] = threshold_scale[-1] + 0.01 # make sure that the last value of the list is greater than the maximum population density

# add choropleth layer on map
map_singapore_choropleth.choropleth(
    geo_data=sg_geo,
    data=sg_choropleth,
    columns=['Name', 'Population Density (people/km2)'],
    key_on='feature.properties.Name',
    threshold_scale=threshold_scale,
    fill_color='YlOrRd', 
    fill_opacity=0.7, 
    line_opacity=0.2,
    legend_name='Sigapore Population Density (people/km2) by Planning Area'
)

# set color scheme for the clusters
x = np.arange(clusterNum)
ys = [i + x + (i*x)**2 for i in range(clusterNum)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to map
for lat, lng, town, median_rent, amenities, cluster in zip(sg_town_amenities_data['Town Latitude'], sg_town_amenities_data['Town Longitude'], sg_town_amenities_data['Town'],  sg_town_amenities_data['Median Rent (SGD/month)'], sg_town_amenities_data['Total Amenities'], sg_town_amenities_data['Clus_km']):
    label = '{}, SGD {} / month, {} Amenities, Cluster {}'.format(town, median_rent, amenities, cluster)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7,
        parse_html=False).add_to(map_singapore_choropleth)  

# display map
map_singapore_choropleth
Out[78]:

Cluster 1 (Purple) = High Population Density, Low Rent, Moderately Balanced Mix of Amenities
Cluster 2 (Green) = Low Population Density, Variable Rent, Variable Mix of Amenities
Cluster 3 (Red) = Moderate Population Density, Variable Rent, Most Balanced Mix of Amenities

In [79]:
df.sort_values('Clus_km', ascending=True).reset_index()
Out[79]:
index Town Median Rent (SGD/month) Population Density (people/km2) Total Amenities Clus_km
0 19 SENGKANG 1900.0 23000.0 8 1
1 7 CHOA CHU KANG 1800.0 30000.0 6 1
2 24 YISHUN 1800.0 10100.0 8 2
3 20 SERANGOON 2100.0 11500.0 6 2
4 4 BUKIT MERAH 2500.0 11000.0 2 2
5 18 SEMBAWANG 1800.0 8400.0 4 2
6 8 CLEMENTI 2400.0 9800.0 7 2
7 9 GEYLANG 2250.0 11400.0 4 2
8 17 QUEENSTOWN 2600.0 4400.0 2 2
9 11 JURONG EAST 2000.0 4400.0 8 2
10 13 KALLANG 2400.0 11000.0 2 2
11 14 MARINE PARADE 2100.0 8000.0 5 2
12 15 PASIR RIS 2000.0 9600.0 6 2
13 22 TOA PAYOH 2300.0 14300.0 6 3
14 21 TAMPINES 2050.0 12400.0 9 3
15 16 PUNGGOL 1900.0 17800.0 8 3
16 0 ANG MO KIO 2100.0 13400.0 7 3
17 10 HOUGANG 1900.0 16000.0 8 3
18 6 OUTRAM 2550.0 13500.0 4 3
19 5 BUKIT PANJANG 1700.0 15000.0 6 3
20 3 BUKIT BATOK 1900.0 14000.0 6 3
21 2 BISHAN 2250.0 12000.0 7 3
22 1 BEDOK 2000.0 13000.0 8 3
23 23 WOODLANDS 1700.0 18700.0 10 3
24 12 JURONG WEST 1950.0 18000.0 8 3
In [ ]: